Skip to content

Conversation

@kmahar
Copy link
Contributor

@kmahar kmahar commented May 6, 2019

This adds two withBSONPointer and withMutableBSONPointer methods to Document that can be used to get access to immutable/mutable pointers underlying the document. I've switched to using those everywhere we were just directly using the bson_t pointer before.

Additionally, all errors are now passed in with withUnsafeMutablePointer instead of with &.

internal func toErrorString(_ error: bson_error_t) -> String {
var e = error
return withUnsafeBytes(of: &e.message) { rawPtr -> String in
return withUnsafeBytes(of: error.message) { rawPtr in
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

swift 4.2 introduced a non-inout version of this, so we no longer need to copy the error

@kmahar kmahar requested review from mbroadst and patrickfreed May 6, 2019 22:56
@kmahar kmahar force-pushed the SWIFT-411/explicit-doc-pointers branch from 2a3d1e5 to 328ae00 Compare May 7, 2019 19:28
@kmahar kmahar force-pushed the SWIFT-411/explicit-doc-pointers branch from 328ae00 to 877d5db Compare May 17, 2019 17:26
@kmahar kmahar changed the title SWIFT-411 Make pointer access to bson_ts and errors more explicit SWIFT-411 Make pointer access to bson_ts more explicit May 17, 2019
@kmahar kmahar requested a review from patrickfreed May 17, 2019 17:28
@kmahar
Copy link
Contributor Author

kmahar commented May 17, 2019

Ok round 2. we have a new withMutableBSONPointer helper used anywhere we mutate a document's storage.
also renamed .data to ._bson and .storage to ._storage.

try arr.setValue(for: String(i), to: val)
}

guard bson_append_array(storage.pointer, key, Int32(key.utf8.count), arr.data) else {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it occurred to me as I made these changes that we are kind of doing the same bad thing here where we are mutating this bson_t. but maybe it's overkill to do some kind of withPointer for the storage, too.
similar to DocumentIterator, DocumentStorage could probably use some rethinking...

Copy link
Contributor

@patrickfreed patrickfreed May 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. It's making me think we should just make DocumentStorage a private type (not even internal) and update all the public or non-Document stuff that uses it just take a Document instead. It seems too dangerous to circumvent the api you're adding to Document in this PR to warrant exposing the storage, even at the internal level.

Copy link
Contributor Author

@kmahar kmahar May 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered that.... it seems like it might confuse users that there would be this public method on BSONValue:
encode(to: Document, forKey key: String)
but that we don't actually intend for them to set values on documents that way....

then again, this existence of that method and DocumentStorage in the public API are already confusing as it is.

I wonder if DocumentStorage should get a bunch of methods on it: appendInt32, appendString, and so on... and then we type switch within Document (or in some generic appendBSONValue method on DocumentStorage) to decide which method to call?

that brings us back to same problem where BSONValue is a protocol but people can't really actually make arbitrary types conform to it since we wouldn't be switching on them there...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we call copyStorageIfRequired where necessary in setValue and we've updated the count property, the bad things we're doing here aren't resulting in any buggy behavior (I think). It would be preferable to guarantee that at the mutating call's site, but without big changes that'll be hard to do for the reasons you mention.

As far as big changes go though, I have a fun idea...

We would update DocumentStorage to be a wrapper around an internal Data buffer that we manage ourselves instead of a bson_t. We'd then update the BSONValue protocol to require a bson: Data property (or something like that) that emits a buffer containing the raw BSON bytes (and remove encode(to:forKey). Then, where we normally call encode, we instead get this buffer from the BSONValue and append it to our internal buffer (with the necessary bookkeeping for keys, types, etc). When we do need a bson_t, we can use bson_init_static to create a lightweight, read-only, and stack allocated bson_t from the internal buffer.

This gets us like 99% of the way to pure Swift BSON, avoids a lot of memory issues, and makes the BSONValue protocol more sensical. This is obviously out of scope for this ticket, but I think it could be a good approach for solving issues we have now as well as moving towards a pure Swift driver.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ha, when I started thinking about the pure Swift BSON library design I came up with something very similar where we would have a bytes property on BSONValue 🙂

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes me realize that pure swift BSON library may actually be a breaking change, so we might have to do it pre-GA 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think we discussed this some time ago. This is what the C++ driver does, rather than keeping instances of bson_t everywhere. bson_t initialization isn't zero overhead, but it might not be that bad. I'd say go ahead and make a ticket for the feature

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filed SWIFT-497

/// Returns the number of (key, value) pairs stored at the top level of this `Document`.
public internal(set) var count: Int
/// the storage backing this document.
internal var _storage: DocumentStorage
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify: what is the convention around underscore naming / what do they signify now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read our conversation earlier as that we should use them for anything besides a public property. but I don't feel very strongly so I'm open to any other convention on this

Copy link
Contributor

@patrickfreed patrickfreed May 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think just internal vs public might be too broad. I think it'd be more useful for it to designate things we truly want as fileprivate or private. I know a lot of times I'll update some private field to internal so I can use it elsewhere, but doing that is not a great practice and leads to problems like what this PR is trying to solve. We could use the underscore naming as a way to say "keep this thing private".

That said, I too don't feel that strongly about this (and have introduced stuff that doesn't align with the above convention myself), so if that convention doesn't make sense I'm fine with not doing it. I do think that if we're going to name some stuff with underscores, we should settle on some convention around them, though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, without a real convention on what the underscore means, I don't think it gives us any extra information. that seems like a reasonable convention to use, though yeah it definitely doesn't align with a lot of the driver now... lots of our _x properties are actually internal.

alternatively we can get rid of all of the underscores and say this is what swift's fine-grained ACLs are for.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both options are fine with me really.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

afaict the underscore is used in Apple's code to indicate a fileprivate

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want to hold up this PR for style nits, so it's not super important what happens to them here. I do think we should either revisit the convention in the future or remove all of them from the codebase if we find that such a convention isn't necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've opened a follow up ticket SWIFT-498

@kmahar kmahar merged commit 72ee41c into master May 21, 2019
@kmahar kmahar deleted the SWIFT-411/explicit-doc-pointers branch May 21, 2019 21:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants